Introduction

Analysis of a CRISPR screen in tryptophan-negative cells. Raphaële provided a detailed Word file that explains the experiment and the algorithms used.

There are three MLE files and three RRA files (biological triplicates). Each contains a beta score (a mean), a z score (a score that takes both the mean and the standard deviation into account), plus p-values, Wald tests and FDR values.
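To make the z score description concrete, here is a minimal sketch in base R, assuming the usual definition (distance from the mean in units of standard deviation). The values and gene names are made up, and the score stored in the MLE/RRA files may be computed differently.

```r
# Toy beta scores for one replicate (hypothetical values, not from the screen)
beta <- c(GeneA = 1.2, GeneB = 0.1, GeneC = -0.4, GeneD = 0.3, GeneE = -1.0)

# z score: distance from the mean, expressed in standard deviations
z <- (beta - mean(beta)) / sd(beta)

round(z, 2)
```

By construction the resulting scores have mean 0 and standard deviation 1, which makes replicates comparable on a common scale.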

I will pool the three files, get the z score across the triplicates and determine the outliers, then compare them with Raphaële's list (as a sanity check on my analysis). I will then take the outliers shared between MLE and RRA to produce the figures Raphaële needs.
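Crossing the two outlier lists comes down to set operations on gene names. A minimal sketch in base R; both vectors are hypothetical stand-ins for the real MLE and RRA outlier lists.

```r
# Hypothetical outlier lists (toy gene sets, not the real results)
mle_outliers <- c("Mrpl28", "Ndufa1", "Mrpl20", "Tufm")
rra_outliers <- c("Ndufa1", "Mrpl20", "Ncl")

# Genes flagged by both methods
shared <- intersect(mle_outliers, rra_outliers)

# Genes flagged by only one method, useful when checking against a reference list
mle_only <- setdiff(mle_outliers, rra_outliers)
rra_only <- setdiff(rra_outliers, mle_outliers)

shared
```

The same two `setdiff()` calls can be used to see which genes differ between my list and Raphaële's.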

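library(tidyverse)  # dplyr, purrr, tidyr
library(readxl)     # read_xlsx

# list.files() expects a regular expression, not a glob
files = list.files("./data/", pattern = "mle_trp")

# Read each replicate, keep genes targeted by at least 4 sgRNAs,
# and suffix the beta and p-value columns with the replicate index
for (i in seq_along(files)){
  file = read_xlsx(paste0("./data/",files[i]))
  file = file %>% filter(sgRNA >=4) %>% 
    dplyr::select(Gene, `treatment|beta`,`treatment|p-value`)
  colnames(file) = c("Gene", paste0("mean_",i), paste0("pval_",i))
  assign(paste0("mle_",i), file)
}

# Join the three replicates on the gene name
data = list(mle_1, mle_2, mle_3) %>% 
  purrr::reduce(full_join, by = "Gene")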

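# Per-gene mean and standard deviation of the beta scores, and mean p-value,
# across the three replicates; genes missing from any replicate are dropped
data_all = 
  data %>% 
  drop_na() %>% 
  rowwise() %>% 
  mutate(mean = mean(c(mean_1, mean_2, mean_3)),
         sd = sd(c(mean_1, mean_2, mean_3)),
         pval = mean(c(pval_1, pval_2, pval_3))) %>% 
  ungroup() %>% 
  dplyr::select(Gene, mean, sd, pval)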

# Replace p-values equal to 0 with 10^-5.5 so that they fit on the graph

data_all$pval[data_all$pval == 0] = 10^-5.5

data = 
  data_all %>% 
    filter(pval<= 0.05)

boxplot(data$mean)

median(data$mean)
## [1] 0.4738267
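# "Outliers" here are genes whose mean beta score lies below the first
# quartile or above the third quartile of the significant genes
q = quantile(data$mean)
outliers = 
  data %>% 
  filter(mean < q[2] | mean > q[4])

# Genes with a mean beta score below -0.3
down = 
  data %>% 
  filter(mean < -0.3)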

outliers %>% 
  head(15) %>% 
  knitr::kable()
Gene       mean       sd         pval
Mrpl28     1.1918667  0.1163599  0.0000032
Ndufa1     1.1643667  0.0969317  0.0000181
37316      1.0765400  0.1310024  0.0000032
Mrpl20     1.1647333  0.0715976  0.0000032
Ndufa2     1.1362333  0.0476430  0.0000032
Ptplb      0.9026700  0.3572377  0.0004527
Mrps25     1.1595667  0.1321792  0.0000032
Agpat1     1.0188100  0.1424605  0.0000181
Hsd17b10   0.9095200  0.1850416  0.0000362
Ndufc2     1.1130333  0.0349172  0.0000032
Gnb2l1     1.0756667  0.0651772  0.0000032
Mrpl41     0.9315833  0.1756950  0.0000181
Dbr1       0.9692067  0.1070501  0.0000181
Tufm       0.8323300  0.2199795  0.0000543
Ncl        0.6533667  0.3726756  0.0087642
dim(outliers)
## [1] 288   4

When I filter for pval < 0.05, I get 288 outliers for the MLE files, compared with Raphaële's 261. For the rest of the analysis, we will start from the list of MLE+RRA outliers that she provided.
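Note that the quartile filter keeps every gene below the first quartile or above the third quartile, so about half of the input by construction. The sketch below uses 577 simulated values (an assumption chosen to mirror the number of significant genes reported further down, not the screen data) and contrasts the filter with the 1.5 x IQR rule that `boxplot()` uses to draw individual points.

```r
set.seed(1)
x <- rnorm(577)  # simulated scores, hypothetical

# Quartile filter, as in the analysis: outside [Q1, Q3]
q <- quantile(x)
outside_iqr <- x < q[2] | x > q[4]

# Points that boxplot() would draw individually (beyond 1.5 x IQR from the hinges)
tukey_out <- boxplot.stats(x)$out

sum(outside_iqr)
length(tukey_out)
```

With 577 distinct values, the quartile filter returns 144 values on each side, 288 in total, whatever the distribution. The boxplot rule is much stricter and flags only the extreme tails.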

Possible plots

MLE: p-value and mean

RRA: p-value and mean

The files containing the p-values need to be retrieved; otherwise I cannot make these plots.
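Once the p-values are available, a "p-value and mean" plot could look like the sketch below. It uses base R graphics and a toy table with the same columns as `data_all`; all values and gene names are hypothetical.

```r
# Toy table with the same columns as data_all (hypothetical values)
toy <- data.frame(
  Gene = c("GeneA", "GeneB", "GeneC", "GeneD"),
  mean = c(1.1, 0.2, -0.5, 0.9),
  pval = c(10^-5.5, 0.2, 0.01, 0.0004)
)

# Plot on a -log10 scale so that small p-values end up at the top
toy$neg_log10_p <- -log10(toy$pval)

plot(toy$mean, toy$neg_log10_p,
     col = ifelse(toy$pval <= 0.05, "red", "grey40"), pch = 19,
     xlab = "Mean beta score", ylab = "-log10(p-value)")
abline(h = -log10(0.05), lty = 2)  # significance threshold used in the filter
text(toy$mean, toy$neg_log10_p, labels = toy$Gene, pos = 3)
```

Flooring p-values of 0 at 10^-5.5, as done earlier, caps the y axis at 5.5 instead of producing infinite values.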

Number of genes attributed at random

Number of genes attributed not at random

ORA

Gene ontology

On the 577 significant genes (not specifically the outliers)

ID Description GeneRatio BgRatio pvalue p.adjust
GO:0140053 mitochondrial gene expression 55/538 95/15811 0 0
GO:0032543 mitochondrial translation 48/538 70/15811 0 0
GO:0010257 NADH dehydrogenase complex assembly 35/538 47/15811 0 0
GO:0032981 mitochondrial respiratory chain complex I assembly 35/538 47/15811 0 0
GO:0033108 mitochondrial respiratory chain complex assembly 39/538 72/15811 0 0
GO:0007005 mitochondrion organization 72/538 456/15811 0 0
GO:0006399 tRNA metabolic process 42/538 155/15811 0 0
GO:0009060 aerobic respiration 38/538 128/15811 0 0
GO:0006119 oxidative phosphorylation 31/538 84/15811 0 0
GO:0045333 cellular respiration 42/538 178/15811 0 0
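The GeneRatio and BgRatio columns are enough to recompute an over-representation p-value by hand. A minimal sketch in base R, assuming the standard one-sided hypergeometric test for ORA and taking the counts from the GO:0032543 row above; the value reported by the enrichment package may differ slightly depending on its settings.

```r
# Counts read from the "mitochondrial translation" row
k <- 48     # significant genes annotated to the term
n <- 538    # significant genes with an annotation
M <- 70     # background genes annotated to the term
N <- 15811  # background genes with an annotation

# Probability of seeing k or more annotated genes when drawing n genes at random
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Fold enrichment: observed proportion over background proportion
fold <- (k / n) / (M / N)

p_value
fold
```

This shows why the table prints 0 for these terms: the p-values are far below the number of digits displayed.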

On the 177 positive outliers

ONTOLOGY ID Description GeneRatio BgRatio pvalue
BP GO:0010257 NADH dehydrogenase complex assembly 28/170 35/538 0.00e+00
BP GO:0032981 mitochondrial respiratory chain complex I assembly 28/170 35/538 0.00e+00
BP GO:0033108 mitochondrial respiratory chain complex assembly 29/170 39/538 0.00e+00
BP GO:1901566 organonitrogen compound biosynthetic process 78/170 166/538 4.00e-07
BP GO:0006412 translation 59/170 116/538 8.00e-07
BP GO:0043043 peptide biosynthetic process 59/170 117/538 1.10e-06
BP GO:0043604 amide biosynthetic process 60/170 122/538 2.80e-06
BP GO:0006518 peptide metabolic process 59/170 120/538 3.70e-06
BP GO:0034645 cellular macromolecule biosynthetic process 62/170 130/538 7.20e-06
BP GO:0043603 cellular amide metabolic process 61/170 129/538 1.31e-05

On the 144 negative outliers

ONTOLOGY ID Description GeneRatio BgRatio pvalue
BP GO:0006357 regulation of transcription by RNA polymerase II 24/71 57/538 0e+00
BP GO:0065007 biological regulation 56/71 256/538 0e+00
BP GO:0006366 transcription by RNA polymerase II 25/71 63/538 0e+00
BP GO:0006355 regulation of transcription, DNA-templated 27/71 73/538 0e+00
BP GO:1903506 regulation of nucleic acid-templated transcription 27/71 73/538 0e+00
BP GO:2001141 regulation of RNA biosynthetic process 27/71 73/538 0e+00
BP GO:0051252 regulation of RNA metabolic process 30/71 88/538 0e+00
BP GO:0050789 regulation of biological process 53/71 238/538 0e+00
BP GO:0019219 regulation of nucleobase-containing compound metabolic process 32/71 103/538 1e-07
BP GO:0050794 regulation of cellular process 51/71 230/538 1e-07

Reactome

On the 577 significant genes (not specifically the outliers)

ID Description GeneRatio BgRatio pvalue p.adjust
R-MMU-5389840 Mitochondrial translation elongation 73/369 83/8497 0.0e+00 0.0000000
R-MMU-5368287 Mitochondrial translation 74/369 86/8497 0.0e+00 0.0000000
R-MMU-5419276 Mitochondrial translation termination 73/369 85/8497 0.0e+00 0.0000000
R-MMU-72766 Translation 88/369 220/8497 0.0e+00 0.0000000
R-MMU-163200 Respiratory electron transport, ATP synthesis by chemiosmotic coupling, and heat production by uncoupling proteins. 53/369 116/8497 0.0e+00 0.0000000
R-MMU-1428517 The citric acid (TCA) cycle and respiratory electron transport 60/369 163/8497 0.0e+00 0.0000000
R-MMU-6799198 Complex I biogenesis 38/369 55/8497 0.0e+00 0.0000000
R-MMU-611105 Respiratory electron transport 46/369 93/8497 0.0e+00 0.0000000
R-MMU-189451 Heme biosynthesis 7/369 14/8497 7.0e-07 0.0000543
R-MMU-163210 Formation of ATP by chemiosmotic coupling 7/369 17/8497 3.7e-06 0.0002245

On the 177 positive outliers

ID Description GeneRatio BgRatio pvalue p.adjust
R-MMU-5419276 Mitochondrial translation termination 53/131 85/8497 0e+00 0.0e+00
R-MMU-5368287 Mitochondrial translation 53/131 86/8497 0e+00 0.0e+00
R-MMU-5389840 Mitochondrial translation elongation 52/131 83/8497 0e+00 0.0e+00
R-MMU-72766 Translation 54/131 220/8497 0e+00 0.0e+00
R-MMU-163200 Respiratory electron transport, ATP synthesis by chemiosmotic coupling, and heat production by uncoupling proteins. 38/131 116/8497 0e+00 0.0e+00
R-MMU-1428517 The citric acid (TCA) cycle and respiratory electron transport 40/131 163/8497 0e+00 0.0e+00
R-MMU-6799198 Complex I biogenesis 28/131 55/8497 0e+00 0.0e+00
R-MMU-611105 Respiratory electron transport 32/131 93/8497 0e+00 0.0e+00
R-MMU-163210 Formation of ATP by chemiosmotic coupling 6/131 17/8497 1e-07 2.2e-06
R-MMU-8949613 Cristae formation 6/131 17/8497 1e-07 2.2e-06

On the 144 negative outliers

ID Description GeneRatio BgRatio pvalue p.adjust
R-MMU-392518 Signal amplification 4/35 32/8497 0.0000080 0.0023261
R-MMU-428930 Thromboxane signalling through TP receptor 3/35 23/8497 0.0001072 0.0118473
R-MMU-418592 ADP signalling through P2Y purinoceptor 1 3/35 24/8497 0.0001221 0.0118473
R-MMU-418597 G alpha (z) signalling events 3/35 31/8497 0.0002659 0.0154777
R-MMU-456926 Thrombin signalling through proteinase activated receptors (PARs) 3/35 31/8497 0.0002659 0.0154777
R-MMU-8939211 ESR-mediated signaling 5/35 158/8497 0.0004315 0.0207295
R-MMU-977444 GABA B receptor activation 3/35 40/8497 0.0005699 0.0207295
R-MMU-991365 Activation of GABAB receptors 3/35 40/8497 0.0005699 0.0207295
R-MMU-3371568 Attenuation phase 2/35 13/8497 0.0012496 0.0389705
R-MMU-9006931 Signaling by Nuclear Receptors 5/35 208/8497 0.0014950 0.0389705

Highlight genes by identified role

Outliers found in GO terms (BP and CC)

ONTOLOGY ID Description GeneRatio BgRatio pvalue
BP GO:0010257 NADH dehydrogenase complex assembly 28/170 35/538 0.0000000
BP GO:0032981 mitochondrial respiratory chain complex I assembly 28/170 35/538 0.0000000
BP GO:0033108 mitochondrial respiratory chain complex assembly 29/170 39/538 0.0000000
BP GO:1901566 organonitrogen compound biosynthetic process 78/170 166/538 0.0000004
BP GO:0006412 translation 59/170 116/538 0.0000008
BP GO:0043043 peptide biosynthetic process 59/170 117/538 0.0000011
BP GO:0043604 amide biosynthetic process 60/170 122/538 0.0000028
BP GO:0006518 peptide metabolic process 59/170 120/538 0.0000037
BP GO:0034645 cellular macromolecule biosynthetic process 62/170 130/538 0.0000072
BP GO:0043603 cellular amide metabolic process 61/170 129/538 0.0000131
BP GO:0006119 oxidative phosphorylation 21/170 31/538 0.0000223
BP GO:0007005 mitochondrion organization 38/170 72/538 0.0000487
BP GO:0140053 mitochondrial gene expression 31/170 55/538 0.0000525
BP GO:0032543 mitochondrial translation 28/170 48/538 0.0000551
BP GO:0043933 protein-containing complex organization 48/170 99/538 0.0000752
BP GO:0045333 cellular respiration 25/170 42/538 0.0000932
BP GO:0009060 aerobic respiration 23/170 38/538 0.0001293
BP GO:0015980 energy derivation by oxidation of organic compounds 25/170 43/538 0.0001609
BP GO:0065003 protein-containing complex assembly 45/170 95/538 0.0002875
BP GO:0022900 electron transport chain 17/170 27/538 0.0005729
BP GO:0022904 respiratory electron transport chain 16/170 25/538 0.0006507
BP GO:0019646 aerobic electron transport chain 13/170 19/538 0.0008742
BP GO:0022607 cellular component assembly 52/170 120/538 0.0014491
BP GO:0046034 ATP metabolic process 21/170 38/538 0.0014779
BP GO:0042773 ATP synthesis coupled electron transport 14/170 22/538 0.0016035
BP GO:0042775 mitochondrial ATP synthesis coupled electron transport 14/170 22/538 0.0016035
CC GO:0005739 mitochondrion 138/171 253/541 0.0000000
CC GO:0098798 mitochondrial protein-containing complex 90/171 128/541 0.0000000
CC GO:0000313 organellar ribosome 53/171 72/541 0.0000000
CC GO:0005761 mitochondrial ribosome 53/171 72/541 0.0000000

Organize by theme

  • BP Mitochondrial gene expression (encompasses mitochondrial translation)

  • BP ATP metabolic process (encompasses proton motive force-driven ATP production)

  • CC respiratory chain complex I

  • BP oxidative phosphorylation
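Grouping hits by theme can be done with a simple lookup from enriched term to theme. A minimal sketch in base R; the term-to-theme mapping follows the list above, but the gene-to-term assignments are a toy example and not the real annotation.

```r
# Map enriched terms to the broader themes listed above
theme_of_term <- c(
  "mitochondrial translation"     = "Mitochondrial gene expression",
  "mitochondrial gene expression" = "Mitochondrial gene expression",
  "ATP metabolic process"         = "ATP metabolic process",
  "oxidative phosphorylation"     = "Oxidative phosphorylation"
)

# Hypothetical gene-to-term annotations (toy data)
hits <- data.frame(
  Gene = c("Mrpl28", "Mrpl20", "Ndufa1", "Ndufa2"),
  Description = c("mitochondrial translation", "mitochondrial gene expression",
                  "oxidative phosphorylation", "ATP metabolic process")
)

# Attach the theme, then list the genes under each theme
hits$theme <- unname(theme_of_term[hits$Description])
genes_by_theme <- split(hits$Gene, hits$theme)

genes_by_theme
```

A gene annotated to several terms would appear under several themes, which is usually what is wanted when highlighting targets.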

Outliers found in reactome

ID Description GeneRatio BgRatio
R-MMU-5419276 Mitochondrial translation termination 53/132 85/8497
R-MMU-5368287 Mitochondrial translation 53/132 86/8497
R-MMU-5389840 Mitochondrial translation elongation 52/132 83/8497
R-MMU-72766 Translation 54/132 220/8497
R-MMU-163200 Respiratory electron transport, ATP synthesis by chemiosmotic coupling, and heat production by uncoupling proteins. 39/132 116/8497
R-MMU-1428517 The citric acid (TCA) cycle and respiratory electron transport 41/132 163/8497
R-MMU-6799198 Complex I biogenesis 29/132 55/8497
R-MMU-611105 Respiratory electron transport 33/132 93/8497
R-MMU-163210 Formation of ATP by chemiosmotic coupling 6/132 17/8497
R-MMU-8949613 Cristae formation 6/132 17/8497
R-MMU-189451 Heme biosynthesis 5/132 14/8497
R-MMU-1592230 Mitochondrial biogenesis 6/132 27/8497
R-MMU-189445 Metabolism of porphyrins 5/132 31/8497
R-MMU-3371378 Regulation by c-FLIP 2/132 10/8497
R-MMU-5218900 CASP8 activity is inhibited 2/132 10/8497
R-MMU-69416 Dimerization of procaspase-8 2/132 10/8497
R-MMU-1362409 Mitochondrial iron-sulfur cluster biogenesis 2/132 11/8497
R-MMU-202427 Phosphorylation of CD3 and TCR zeta chains 2/132 12/8497
R-MMU-199220 Vitamin B5 (pantothenate) metabolism 2/132 13/8497
R-MMU-389948 PD-1 signaling 2/132 13/8497
R-MMU-140534 Caspase activation via Death Receptors in the presence of ligand 2/132 15/8497
R-MMU-202433 Generation of second messenger molecules 2/132 19/8497
R-MMU-5357769 Caspase activation via extrinsic apoptotic signalling pathway 2/132 19/8497
R-MMU-71403 Citric acid cycle (TCA cycle) 2/132 21/8497
R-MMU-5213460 RIPK1-mediated regulated necrosis 2/132 28/8497
R-MMU-5675482 Regulation of necroptotic cell death 2/132 28/8497
R-MMU-1852241 Organelle biogenesis and maintenance 6/132 212/8497
R-MMU-9706019 RHOBTB3 ATPase cycle 1/132 10/8497
R-MMU-264870 Caspase-mediated cleavage of cytoskeletal proteins 1/132 11/8497
R-MMU-71406 Pyruvate metabolism and Citric Acid (TCA) cycle 2/132 47/8497

Organize and highlight targets by theme and functions

## R version 4.2.2 (2022-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Belgium.utf8  LC_CTYPE=English_Belgium.utf8   
## [3] LC_MONETARY=English_Belgium.utf8 LC_NUMERIC=C                    
## [5] LC_TIME=English_Belgium.utf8    
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] org.Mm.eg.db_3.16.0   AnnotationDbi_1.60.0  IRanges_2.32.0       
##  [4] S4Vectors_0.36.0      Biobase_2.58.0        BiocGenerics_0.44.0  
##  [7] ReactomePA_1.42.0     clusterProfiler_4.6.0 msigdbr_7.5.1        
## [10] readxl_1.4.1          ggrepel_0.9.2         forcats_0.5.2        
## [13] stringr_1.4.1         dplyr_1.0.10          purrr_0.3.5          
## [16] readr_2.1.3           tidyr_1.2.1           tibble_3.1.8         
## [19] ggplot2_3.4.0         tidyverse_1.3.2      
## 
## loaded via a namespace (and not attached):
##   [1] shadowtext_0.1.2       backports_1.4.1        fastmatch_1.1-3       
##   [4] systemfonts_1.0.4      plyr_1.8.8             igraph_1.3.5          
##   [7] lazyeval_0.2.2         splines_4.2.2          BiocParallel_1.32.4   
##  [10] GenomeInfoDb_1.34.4    digest_0.6.30          yulab.utils_0.0.6     
##  [13] htmltools_0.5.3        GOSemSim_2.24.0        viridis_0.6.2         
##  [16] GO.db_3.16.0           fansi_1.0.3            magrittr_2.0.3        
##  [19] memoise_2.0.1          googlesheets4_1.0.1    tzdb_0.3.0            
##  [22] Biostrings_2.66.0      graphlayouts_0.8.4     modelr_0.1.10         
##  [25] svglite_2.1.1          timechange_0.1.1       enrichplot_1.18.3     
##  [28] colorspace_2.0-3       rappdirs_0.3.3         blob_1.2.3            
##  [31] rvest_1.0.3            textshaping_0.3.6      haven_2.5.2           
##  [34] xfun_0.34              crayon_1.5.2           RCurl_1.98-1.9        
##  [37] jsonlite_1.8.3         graph_1.76.0           scatterpie_0.1.8      
##  [40] ape_5.6-2              glue_1.6.2             polyclip_1.10-4       
##  [43] gtable_0.3.1           gargle_1.2.1           zlibbioc_1.44.0       
##  [46] XVector_0.38.0         graphite_1.44.0        scales_1.2.1          
##  [49] DOSE_3.24.2            DBI_1.1.3              Rcpp_1.0.9            
##  [52] viridisLite_0.4.1      gridGraphics_0.5-1     tidytree_0.4.2        
##  [55] reactome.db_1.82.0     bit_4.0.4              httr_1.4.4            
##  [58] fgsea_1.24.0           RColorBrewer_1.1-3     ellipsis_0.3.2        
##  [61] pkgconfig_2.0.3        farver_2.1.1           sass_0.4.2            
##  [64] dbplyr_2.2.1           utf8_1.2.2             labeling_0.4.2        
##  [67] ggplotify_0.1.0        tidyselect_1.2.0       rlang_1.0.6           
##  [70] reshape2_1.4.4         munsell_0.5.0          cellranger_1.1.0      
##  [73] tools_4.2.2            cachem_1.0.6           downloader_0.4        
##  [76] cli_3.4.1              generics_0.1.3         RSQLite_2.2.19        
##  [79] gson_0.0.9             broom_1.0.1            evaluate_0.18         
##  [82] fastmap_1.1.0          ragg_1.2.5             yaml_2.3.6            
##  [85] ggtree_3.6.2           babelgene_22.9         knitr_1.41            
##  [88] bit64_4.0.5            fs_1.5.2               tidygraph_1.2.2       
##  [91] KEGGREST_1.38.0        ggraph_2.1.0           nlme_3.1-160          
##  [94] aplot_0.1.9            xml2_1.3.3             compiler_4.2.2        
##  [97] rstudioapi_0.14        png_0.1-7              reprex_2.0.2          
## [100] treeio_1.22.0          tweenr_2.0.2           bslib_0.4.1           
## [103] stringi_1.7.8          highr_0.9              lattice_0.20-45       
## [106] Matrix_1.5-1           vctrs_0.5.0            pillar_1.8.1          
## [109] lifecycle_1.0.3        jquerylib_0.1.4        data.table_1.14.4     
## [112] cowplot_1.1.1          bitops_1.0-7           patchwork_1.1.2       
## [115] qvalue_2.30.0          R6_2.5.1               gridExtra_2.3         
## [118] codetools_0.2-18       MASS_7.3-58.1          assertthat_0.2.1      
## [121] withr_2.5.0            GenomeInfoDbData_1.2.9 parallel_4.2.2        
## [124] hms_1.1.2              grid_4.2.2             ggfun_0.0.9           
## [127] HDO.db_0.99.1          rmarkdown_2.18         googledrive_2.0.0     
## [130] ggforce_0.4.1          lubridate_1.9.0